Abstract

Biomolecular structures are assemblies of emergent anisotropic building blocks, uniaxial helices 
and biaxial strands. We provide a conceptually novel approach to understand a marginally compact 
phase of matter that is occupied by proteins and DNA. This phase, which is in some respects
analogous to the liquid crystal phase for chain molecules, stabilizes a range of shapes that can 
be obtained by sequence independent interactions occurring intra- and inter-molecularly between 
polymeric molecules. We present a singularity free self-interaction for a tube in the continuum 
limit and show that this results in the tube being positioned in the marginally compact phase. Our 
work provides a unified framework for understanding the building blocks of biomolecules. 
I. INTRODUCTION



The structures and phases adopted by inanimate matter have traditionally been understood
and predicted by simple paradigms; e.g., seemingly disparate phenomena such as
phases of matter, magnetism, critical phenomena, and neural networks [1] have been successfully
studied within the unified framework of an Ising model [2]. Liquid crystals [3], whose
molecules are not spherical, form several distinct stable, yet sensitive, structures. They possess
translational order in fewer than three dimensions and/or orientational order, and exist
in a phase between a liquid with no translational order and a crystal with translational order
in all three directions. The liquid crystal phase is poised in the vicinity of the transition
to the liquid phase, which accounts for its exquisite sensitivity. Any material that resides in
a particular phase of matter exhibits the general properties characteristic of that specific
phase, and just a few essential ingredients, such as the symmetry of the atoms or
molecules comprising the material and certain macroscopic parameters such as the pressure
and temperature, determine the relevant phase.

Biomolecules, such as DNA and proteins, form the basis of life and exhibit simple forms
such as a single, double or triple helix and almost planar sheets assembled from zig-zag
strands. The latter are also implicated in amyloid structures, which play a role in diseases
such as Alzheimer's and Type II diabetes. The origin of these structures is now well understood
based on details of the constituent atoms and the quantum chemistry that governs
their assembly. The common use of these modular structures by nature calls for
a simple unified explanation for their ubiquity. Here we show that not only single, double
and triple helices but also planar sheets made up of biaxial strands are natural forms in
a conceptually novel, marginally compact phase of matter of a flexible tube, the simplest
description of a chain molecule that incorporates the correct symmetry. Remarkably, this
phase of matter is analogous to a liquid crystalline phase but for chain molecules and is assembled
from emergent anisotropic building blocks. Our work provides a unified description,
which transcends chemical details, of the structural motifs of biomolecules. We elucidate the
role played by discreteness in promoting the creation of biaxial strands through spontaneous
symmetry breaking. An important consequence of our work is the suggestion that physical
scientists and engineers who wish to build nifty machines akin to proteins would do well to
design their devices so that they are poised in this phase of matter with all its advantages.
The fluid and crystalline phases of ordinary matter are well described by a simple model of
a collection of beads or hard spheres. A hard sphere can be thought of as a point, a zero
dimensional object, in space with an excluded volume region obtained by symmetrically
inflating it to a size equal to its radius. The packing of spheres is a classic optimization
problem with a long and venerable history and many important applications. There are
a large number of important synthetic materials, such as plastic, rubber, gels, and textile
fibres, comprised of polymer molecules [10]. Life is also based on chain molecules such as
DNA and proteins. The generalization of the hard sphere to a one dimensional manifold
consists of taking a curve and symmetrically inflating it to form a flexible tube of thickness
$\Delta$ characterized by uniaxial symmetry (Figure 1). We will show that the tube paradigm
provides a unified and natural explanation for helical forms and sheets in biomolecules.



II. RESULTS AND DISCUSSION 



Buried area of tubes 



We begin by describing the discrete version of a tube of thickness $\Delta$, represented by a chain
of coins whose planes coincide with equally spaced circular cross-sections of the tube. The
self-avoidance of the tube is implemented by the three-body prescription [11, 12] described
in the caption of Figure 1. A classic way to take into account the solvophobic effect is to
introduce an attractive pairwise interaction between the coin centers with an interaction
range $R_0$. In [13, 14], it was shown that, when $\Delta \sim R_0$, a short tube had relatively few low
energy conformations compared to the generic compact phase ($\Delta \ll R_0$) and the swollen
phase ($\Delta \gg R_0$). Structures in this novel marginally compact phase were found to be
constructed from two building blocks, helices and zig-zag strands, and to possess liquid
crystal like sensitivity because they are poised in the vicinity of a transition to the swollen
phase.
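
The three-body prescription mentioned above (and spelled out in the caption of Figure 1) can be illustrated with a short sketch. The Python fragment below is only an illustration under stated assumptions, not the authors' code: the names `triplet_radius` and `is_self_avoiding` are hypothetical, the tube axis is represented by a list of 3D points, and every triplet is checked by brute force.

```python
import itertools
import math

def triplet_radius(p, q, r):
    """Radius of the circle through three points (inf if they are collinear)."""
    a, b, c = math.dist(p, q), math.dist(q, r), math.dist(r, p)
    # Heron's formula for the triangle area; clamp rounding noise at zero.
    s = 0.5 * (a + b + c)
    area_sq = max(s * (s - a) * (s - b) * (s - c), 0.0)
    if area_sq == 0.0:
        return math.inf
    return a * b * c / (4.0 * math.sqrt(area_sq))

def is_self_avoiding(points, delta):
    """True if no triplet of axis points lies on a circle of radius < delta."""
    return all(triplet_radius(p, q, r) >= delta
               for p, q, r in itertools.combinations(points, 3))
```

Local triplets (neighbouring points) bound the curvature of the axis, while non-local triplets keep distant portions of the tube from overlapping, so a single criterion handles both constraints. The brute-force loop scales as the cube of the number of points and is meant for short chains only.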

In the continuum limit, two-body interactions are necessarily singular because there is a
continuum of pairs, close by along the chain, that are within the interaction range. A
singularity-free formulation of the attractive interaction follows from the following physical
situation. Let us suppose that the tube is immersed in a poor solvent whose molecules
are approximated by spherical balls of radius $R$. For any given tube configuration, there
are regions of the tube surface that the solvent molecules can come in contact with and
other regions that are inaccessible. The latter constitute the buried surface of the tube
configuration. The bigger the radius of the solvent molecule, the larger is the buried
surface. We will show that an interaction based on the buried area is sufficient to lead to
ground state tube conformations in the marginally compact phase with a variety of secondary
motifs.

The simplest potential for the solvophobic interaction is given by

$$\mathcal{H}_0(\{\mathbf{r}(s)\}) = -\kappa_{\mathrm{sol}}\,\Sigma_B(R), \tag{1}$$

where $\mathbf{r}(s)$ defines the smooth curve corresponding to the tube axis, $s$ is the arc-length,
$\Sigma_B(R)$ is the buried area in the presence of solvent molecules of radius $R$, and $\kappa_{\mathrm{sol}}$ is an effective
interaction strength which we set equal to one without loss of generality. An analytical
derivation of $\Sigma_B(R)$ is given in Methods.

The interaction as given by Eq. (1) describes a tube with a uniform solvophobicity. As
discussed below, for proteins it is more appropriate to introduce a mixed solvophobicity
tube. In its simplest version this tube has two types of surface regions, each characterized
by a different degree of solvophobicity, as described by Eq. (9) in Methods.



Uniform solvophobic tubes 

Figure 2 (a-h) depicts the conformations adopted by short tubes subject to compaction.
Our results are obtained by maximizing the buried surface area [15, 16, 17] of the tube
(see Eqs. (1) and (8)). Such an optimization requirement is generically encountered when a
tube shows a higher affinity to itself than to a solvent, e.g. in poor solvent conditions [10].
Strikingly, the conformations of choice are single, double and triple helices, all characterized
by chirality and adopted by nature in the context of biomolecules such as proteins and DNA.
It is remarkable that the shapes of close packed single and double helices adopted by flexible
tubes match those of $\alpha$-helices [18, 19] and the DNA double-helix [20], respectively. At
somewhat higher temperatures, one obtains the conformation in Figure 2 (h), comprising
almost parallel elongated tube segments, as a result of the interplay between the
maximization of the buried area and that of the entropy. Figure 3 shows an optimal
arrangement of several segments of continuous tubes arranged in a hexagonal array. (Such
an arrangement is called an Abrikosov flux-lattice in the field of superconductivity [21].)
One would expect that the helical state of a few tubes would be supplanted by such an
ordering when many tubes are packed together.



Mixed solvophobic tubes 

We turn now to a consideration of two distinct mechanisms that promote sheet formation
within the context of the tube picture. The first mechanism is directly inspired by
the observation that the side-chains of amino acids stick out in a direction approximately
opposite to the bending direction of the protein backbone [22], yielding an effectively mixed
solvophobic tube. In other words, certain parts of the solvophobic tube, determined by
the instantaneous tube conformation, are already protected from the solvent by the side
chains whereas the rest of the tube needs to shield itself from the solvent by means of the
compaction process. The structures (i-l) in Figure 2, obtained by minimizing the energy
given by Eq. (9), are optimal conformations for a single tube (i) and for multiple tubes at
low temperature (j) and at a higher temperature (k, l). Our studies have been carried out
at $c = 5$, which corresponds to the region P being solvophilic and the region H being solvophobic.
The resulting sheet structure is characterized by planarity as well as strands that
zigzag normal to the plane of the sheet, as observed in real protein structures.

The second mechanism is more subtle and does not invoke mixed solvophobicity but instead
arises from the consideration of a discrete version of the tube as described above. Instead
of maximizing the buried surface area of the continuous tube, we now seek to maximize
the number of pairwise contacts between non-consecutive coin centers within a prescribed
mutual distance of the order of the tube thickness, in order to be within the marginally
compact phase [13, 14]. The optimal packing for the discrete case at its edge of compaction
is shown in Figure 4: one obtains a planar arrangement of chains which zigzag within the
plane. In the continuum, one retains the uniaxial anisotropy characteristic of a tube, whereas
in the discrete case the symmetry between the two directions perpendicular to the principal
strand direction is spontaneously broken (in the mixed solvophobic tube, this symmetry is
broken overtly). Strikingly, the out-of-plane zigzag pattern shown in Figure 2 for continuous
tubes is realized in protein $\beta$-sheets when viewed in the backbone representation (see Figure 5a),
whereas the in-plane zigzag pattern of the discrete case shown in Figure 4 is obtained in a
different representation with interaction centers on both the N-H and C-O bonds (see
Figure 5b).
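
The quantity maximized in the discrete case, the number of pairwise contacts between non-consecutive coin centers, can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function name `count_contacts` is hypothetical, the default thresholds (1.1 and 1.6 units) are those quoted in the caption of Figure 4, only a single chain is handled, and the exclusion of the first and last points mentioned in that caption is omitted for brevity.

```python
import itertools
import math

def count_contacts(points, contact_range=1.6, hard_core=1.1):
    """Count contacts between non-consecutive beads of a single chain.

    Raises ValueError if two non-consecutive beads are closer than hard_core.
    """
    contacts = 0
    for i, j in itertools.combinations(range(len(points)), 2):
        if j - i < 2:  # skip consecutive beads along the chain
            continue
        d = math.dist(points[i], points[j])
        if d < hard_core:
            raise ValueError(f"beads {i} and {j} overlap (distance {d:.3f})")
        if d < contact_range:
            contacts += 1
    return contacts
```

In a full optimization this count would be combined with the three-body self-avoidance criterion of Figure 1, so that only conformations of a tube with the prescribed thickness are compared.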



Conclusions 



Our work suggests that protein native state structures occupy a novel phase of matter 
corresponding to that of compact conformations of a flexible tube. This marginally compact 
phase is analogous in several respects to the liquid crystal phase but this time for chain 
molecules. The liquid crystal phase is exquisitely sensitive to perturbations because it is 
poised close to the transition to the liquid phase. Likewise, protein structures are able to
facilitate the wide range of functions that proteins perform in the living cell because a
tube at the edge of compaction is in the vicinity of a swollen phase in which the attractive
potential is no longer operational. And just as liquid crystals are made up of anisotropic
molecules, protein native state structures are made up of emergent anisotropic building
blocks: uniaxial helices and almost planar sheets comprised of biaxial strands. It is tempting
to speculate that nature, through evolution and natural selection, has been able to exploit 
the marginally compact phase of flexible tubes under the constraints of quantum chemistry 
governing covalent and hydrogen bonds. 



III. METHODS 



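We derive an analytical expression for the buried area of a tube of length $L$ ($0 < s < L$)
and radius $\Delta$. A generic point on the tube surface is given by

$$\mathbf{u}(s,\theta) = \mathbf{r}(s) + \Delta\left(\mathbf{n}(s)\cos\theta + \mathbf{b}(s)\sin\theta\right), \tag{2}$$

where $\mathbf{n}(s)$ and $\mathbf{b}(s)$ are the normal and binormal vectors at position $s$, respectively, and $\theta$ is
an azimuthal angle running from $0$ to $2\pi$ [23]. The surface element is given by $\sqrt{g(s,\theta)}\,ds\,d\theta$, where

$$g = \det\begin{pmatrix} \left(\dfrac{\partial\mathbf{u}}{\partial s}\right)^{2} & \dfrac{\partial\mathbf{u}}{\partial s}\cdot\dfrac{\partial\mathbf{u}}{\partial\theta} \\[2ex] \dfrac{\partial\mathbf{u}}{\partial\theta}\cdot\dfrac{\partial\mathbf{u}}{\partial s} & \left(\dfrac{\partial\mathbf{u}}{\partial\theta}\right)^{2} \end{pmatrix}. \tag{3}$$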

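Note that $d^{2}\mathbf{r}/ds^{2} = \kappa\,\mathbf{n}(s)$; $R_c(s) = 1/\kappa(s)$ is the local radius of curvature; and $d\mathbf{b}/ds = -\tau\,\mathbf{n}(s)$, where $\tau$ is the torsion [24]. One obtains

$$g = \Delta^{2}\left(1 - \Delta\kappa\cos\theta\right)^{2}. \tag{4}$$

For the tube $R_c(s) = 1/\kappa(s) > \Delta$, $\forall s$, thus $1 - \Delta\kappa\cos\theta > 0$, and

$$\sqrt{g} = \Delta\left(1 - \Delta\kappa\cos\theta\right). \tag{5}$$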
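
The step from Eq. (3) to Eq. (4) can be made explicit. The following is a sketch that takes $g$ in Eq. (3) to be the determinant of the surface metric, writes $\Delta$ for the tube radius and $\mathbf{t} = d\mathbf{r}/ds$ for the unit tangent, and assumes the standard Frenet-Serret relations, of which $d\mathbf{n}/ds = -\kappa\,\mathbf{t} + \tau\,\mathbf{b}$ is not written out in the text:

```latex
\begin{align*}
\frac{\partial\mathbf{u}}{\partial s}
  &= (1 - \Delta\kappa\cos\theta)\,\mathbf{t}
   + \Delta\tau\,(\mathbf{b}\cos\theta - \mathbf{n}\sin\theta), \\
\frac{\partial\mathbf{u}}{\partial\theta}
  &= \Delta\,(\mathbf{b}\cos\theta - \mathbf{n}\sin\theta), \\
\left(\frac{\partial\mathbf{u}}{\partial s}\right)^{2}
  &= (1 - \Delta\kappa\cos\theta)^{2} + \Delta^{2}\tau^{2}, \qquad
\left(\frac{\partial\mathbf{u}}{\partial\theta}\right)^{2} = \Delta^{2}, \qquad
\frac{\partial\mathbf{u}}{\partial s}\cdot\frac{\partial\mathbf{u}}{\partial\theta}
   = \Delta^{2}\tau, \\
g &= \Delta^{2}\left[(1 - \Delta\kappa\cos\theta)^{2} + \Delta^{2}\tau^{2}\right]
   - \Delta^{4}\tau^{2}
   = \Delta^{2}(1 - \Delta\kappa\cos\theta)^{2}.
\end{align*}
```

The torsion cancels between the diagonal and off-diagonal terms, so the surface element depends on the axis only through its local curvature.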

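The total area of the tube is

$$\Sigma = \int_{0}^{2\pi} d\theta \int_{0}^{L} ds\,\sqrt{g} = 2\pi\Delta L. \tag{6}$$

The buried surface is determined by the inequality

$$B_R(s,\theta) = \min_{s'}\left|\mathbf{r}(s) + (\Delta + R)\left(\mathbf{n}(s)\cos\theta + \mathbf{b}(s)\sin\theta\right) - \mathbf{r}(s')\right| < \Delta + R, \tag{7}$$

yielding an expression for the buried area:

$$\Sigma_B(R) = \Delta \int_{0}^{L} ds \int_{0}^{2\pi} d\theta \left[1 - \frac{\Delta}{R_c(s)}\cos\theta\right] \cdot \Theta\left[\Delta + R - B_R(s,\theta)\right], \tag{8}$$

where $\Theta(x)$ is equal to 1 if $x > 0$ and 0 otherwise.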
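
Eq. (8) lends itself to a direct numerical estimate. The sketch below is not the authors' implementation; it is a minimal Riemann-sum illustration in which the function name `buried_area`, the sampling parameters and the small tolerance `tol` are all assumptions. Each axis sample carries its position, normal, binormal and curvature; `delta` stands for the tube radius and `R` for the solvent radius. As an assumed extension to several tubes, the minimum in Eq. (7) is taken over the sampled axis points of all tubes.

```python
import math

def buried_area(tubes, delta, R, ds, n_theta=720, tol=1e-9):
    """Riemann-sum estimate of the buried area of Eq. (8), summed over tubes.

    tubes: list of tubes; each tube is a list of samples (r, n, b, kappa)
           with r, n, b 3-tuples (axis point, normal, binormal) and kappa
           the local curvature. ds is the arc-length spacing of the samples.
    """
    axis = [r for tube in tubes for (r, _, _, _) in tube]
    reach = delta + R                      # distance of probe center from axis
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for tube in tubes:
        for r, n, b, kappa in tube:
            for k in range(n_theta):
                theta = (k + 0.5) * dtheta
                c, s = math.cos(theta), math.sin(theta)
                probe = [r[i] + reach * (n[i] * c + b[i] * s) for i in range(3)]
                # B_R of Eq. (7), with the minimum restricted to sampled points.
                big_b = min(math.dist(probe, q) for q in axis)
                # The tolerance keeps the point s' = s, where the distance
                # equals delta + R up to rounding, from counting as buried.
                if big_b < reach - tol:
                    total += delta * (1.0 - delta * kappa * c) * ds * dtheta
    return total
```

A simple check is provided by two straight, parallel tubes in contact (axis separation $2\Delta$): elementary geometry gives a buried half-angle $\arccos[\Delta/(\Delta+R)]$ on each tube, so the buried area per tube is $2\Delta L \arccos[\Delta/(\Delta+R)]$, which the sketch reproduces to within the angular resolution. An isolated straight tube has no buried area.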

The simplest version of a mixed solvophobicity tube has two types of surface regions,
each characterized by a different degree of solvophobicity. We denote these regions as P (for
solvophilic) and H (for solvophobic), respectively. We consider the generalized Hamiltonian

$$\mathcal{H}_c(\{\mathbf{r}(s)\}) = -\Delta \int_{0}^{L} ds \int_{0}^{2\pi} d\theta \left(1 - \frac{\Delta}{R_c(s)}\cos\theta\right) \cdot \Big\{\Theta\left[\Delta + R - B_R(s,\theta)\right] - c\,\Theta\left[\theta_1 - |\pi - \theta|\right] \cdot \left(1 - \Theta\left[\Delta + R - B_R(s,\theta)\right]\right)\Big\}, \tag{9}$$

where $\theta_1$ defines half the angular width of region P centered around $\theta = \pi$ and $c$ is a
measure of the coupling between this region and the solvent. The case $c = 0$ corresponds to
the uniform tube described previously.
FIG. 1: Sketch of a hard sphere and a tube. The self-avoidance of an ensemble of hard spheres,
each of radius $\Delta$, can be ensured by considering all pairs of spheres and requiring that none of the
distances between the sphere centers is less than $2\Delta$. The self-avoidance of a tube of thickness $\Delta$
can be enforced through a suitable three-body potential [11, 12]. We denote the tube axis by a
smooth curve, $\mathbf{r}(s)$, where the arc-length $s$ satisfies $0 < s < L$ and $L$ is the total length of the tube.
For a tube one considers all triplets of points $\mathbf{r}_i = \mathbf{r}(s_i)$, $i = 1, 2, 3$, on the tube axis, draws
circles through them, and requires that none of the radii $r(\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3)$ of these circles is less than the
tube radius $\Delta$.
FIG. 2: Optimal conformations of tubes subject to compaction. In our simulations, we considered
a discretized representation with $N$ segments separated by a distance $b = \Delta/2$, where $\Delta$ is the tube
radius. The continuum limit is obtained when $b \ll \Delta$; we have verified that our results are substantially
the same on reducing the value of $b$ down to $\Delta/3$. The conformations are obtained using
Metropolis Monte Carlo simulations by annealing or by long simulations at constant temperature.
The simulations are performed with standard pivot and crank-shaft move sets [25]. For systems of
multiple chains, the tubes are placed inside a hard-wall cubic box of side $40\Delta$; we have verified
that the walls of the box do not influence the conformations shown. (a-e) Conformations of single
solvophobic tubes ($c = 0$) of various lengths and for different solvent molecule radii $R$ that maximize
the buried area (Eq. 8): a) $N = 20$ and $R = 0.1\Delta$, b) $N = 20$ and $R = 0.2\Delta$, c) $N = 30$ and
$R = 0.1\Delta$, d) $N = 40$ and $R = 0.1\Delta$, e) $N = 50$ and $R = 0.2\Delta$. (f-h) Optimal conformations of
multiple solvophobic tubes ($c = 0$) of length $N = 20$ and for $R = 0.1\Delta$ obtained in long simulations
at constant temperatures: f) two tubes at a low temperature, g) three tubes at a low temperature,
h) four tubes at an intermediate temperature. (i-l) Conformations of mixed solvophobicity tubes
($c = 5$) that minimize the energy (Eq. 9), with $R = \Delta/2$ in all cases: i) a single helix of length $N = 30$
and $\theta_1 = 15^\circ$ obtained by slow annealing (one obtains the same conformation for $\theta_1 = 30^\circ$ or $45^\circ$);
j) a stack of 4 helices of length $N = 15$ and $\theta_1 = 45^\circ$ obtained by slow annealing; k-l) two views
of a planar sheet arrangement of 5 chains of length $N = 15$ and $\theta_1 = 30^\circ$ obtained in a constant
temperature simulation run at $T = 0.4$.
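
Two ingredients named in the caption of Figure 2, the crank-shaft move and the Metropolis acceptance rule, can be sketched as follows. This is a generic textbook-style illustration, not the authors' simulation code: the function names `crankshaft` and `metropolis_accept` are hypothetical, the chain is a list of 3D points, and the energy change `delta_e` is assumed to be supplied by the caller (for instance from a change in buried area).

```python
import math
import random

def crankshaft(points, i, angle):
    """Rotate bead i about the axis through beads i-1 and i+1 by `angle`.

    Both bond lengths adjacent to bead i are preserved by construction.
    """
    a, p, c = points[i - 1], points[i], points[i + 1]
    axis = [c[k] - a[k] for k in range(3)]
    norm = math.sqrt(sum(x * x for x in axis))
    u = [x / norm for x in axis]
    v = [p[k] - a[k] for k in range(3)]
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    cross = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
    dot = sum(u[k] * v[k] for k in range(3))
    # Rodrigues' rotation formula applied to v about the unit vector u.
    rotated = [v[k] * cos_t + cross[k] * sin_t + u[k] * dot * (1.0 - cos_t)
               for k in range(3)]
    new_points = list(points)
    new_points[i] = tuple(a[k] + rotated[k] for k in range(3))
    return new_points

def metropolis_accept(delta_e, temperature, rng):
    """Metropolis rule: always accept downhill moves, uphill ones with
    probability exp(-delta_e / temperature)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```

In an annealing run the temperature passed to `metropolis_accept` would be lowered gradually, and any trial conformation that violates the tube self-avoidance condition would be rejected before the energy is evaluated.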
FIG. 3: Optimal arrangement of several segments of continuous tubes arranged in an Abrikosov
flux-lattice like state [21] with straight tube segments parallel to each other. In the plane orthogonal
to the tube axes, the tube cross sections are arranged in a hexagonal array.
FIG. 4: Sketch of optimally packed short segments of three tubes (dark lines) obtained from
Metropolis Monte Carlo annealing simulations at the edge of compaction. The self-avoidance of
a discrete tube (defined through a set of $N$ points along the discretized tube axis $\{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$
with unit spacing ($b = 1$) between consecutive points) is enforced through the three-body potential
defined in the caption of Figure 1. The number of pairwise contacts between non-consecutive beads
is maximized. Any two such beads are forbidden to come closer than 1.1 units and are defined to
form a contact when they come closer than 1.6 units ($R_1 = 1.6$) (these numbers have been selected
to conform to the known length scales associated with real proteins). For convenience, the three
tube segments are placed inside a hard-wall spherical box of radius 9 units; the conformations
shown are not affected by the presence of the walls. Our simulations were performed with standard
pivot, crank-shaft, and tail slithering move sets. Random translations of one of the chains were also
attempted. All tubes have a radius $\Delta = 1.1$. In order to minimize edge effects, the tubes were of
different lengths and the first and last points, which are not shown in the figure, were not allowed
to form any contact. One obtains 8 pairwise contacts for both the ground state arrangements
shown in the figure. Also drawn are some of the circles of radius $\Delta$ going through several local and
non-local triplets.
FIG. 5: Parallel $\beta$-sheets from a structural model of CA150.WW2 protofilaments forming amyloid
fibrils (Protein Data Bank [26] ID code 2NNT). The model is based on distance constraints
obtained by means of magic angle spinning (MAS) NMR spectroscopy [27]. (a) Side view of the
two $\beta$-sheets (in yellow) forming the hairpin structure of the whole protofilament. A backbone
representation is employed [28]. (b) Top view of the $\beta$-sheet included in the rectangular box in (a).
The representation used in (b) employs virtual interaction centers based on main backbone atom
positions. Blue (red) spheres are placed in the middle of the N-H (C-O) bonds and lie approximately
in the same plane. Thick black lines are drawn to connect interaction centers along the
same $\beta$-strand. Thin green lines are drawn to represent interactions (i.e. virtual hydrogen bonds)
between neighboring strands.